OptMap: Geometric Map Distillation via Submodular Maximization
Thorne, David, Chan, Nathan, Robison, Christa S., Osteen, Philip R., Lopez, Brett T.
Abstract--Autonomous robots rely on geometric maps to inform a diverse set of perception and decision-making algorithms. As autonomy requires reasoning and planning on multiple scales of the environment, each algorithm may require a different map for optimal performance. Light Detection And Ranging (LiDAR) sensors generate an abundance of geometric data to satisfy these diverse requirements, but selecting informative, size-constrained maps is computationally challenging as it requires solving an NP-hard combinatorial optimization. In this work we present OptMap: a geometric map distillation algorithm which achieves real-time, application-specific map generation via multiple theoretical and algorithmic innovations. A central feature is the maximization of set functions that exhibit diminishing returns, i.e., submodularity, using polynomial-time algorithms with provably near-optimal solutions. We formulate a novel submodular reward function which quantifies informativeness, reduces input set sizes, and minimizes bias in sequentially collected datasets. Further, we propose a dynamically reordered streaming submodular algorithm which improves empirical solution quality and addresses input order bias via an online approximation of the value of all scans. Testing was conducted on open-source and custom datasets with an emphasis on long-duration mapping sessions, highlighting OptMap's minimal computation requirements. Open-source ROS1 and ROS2 packages are available and can be used alongside any LiDAR SLAM algorithm.
Modern autonomous systems use a modular software architecture with separate algorithms for perceiving the environment, planning collision-free paths, estimating vehicle motion, and making higher-level decisions to complete their tasks. Many of these algorithms depend on geometric information about the environment to function properly. As a result, their performance and processing time can vary greatly depending on the quality of the geometric data.
For example, trajectory planners use geometric maps to plan collision-free paths, but the density of geometric data is critical for balancing real-time replanning requirements against reliable collision detection. This trade-off is best served by dense geometric maps that specifically capture the intended trajectory corridor (Figure 1a). In contrast, localization entails aligning a source and reference point cloud, a process best served by using a sparse and global reference point cloud to minimize computation time while maximizing alignment accuracy (Figure 1b).
Distribution Statement A: Approved for public release; distribution is unlimited.
Figure 1a: The map is dense while remaining efficient, as only points near the intended trajectory are returned.
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- Oceania > New Zealand > South Island > Marlborough District > Blenheim (0.04)
- North America > United States > Maryland > Prince George's County > Adelphi (0.04)
- (4 more...)
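The near-optimality guarantee mentioned in the abstract comes from the classic greedy rule for monotone submodular maximization, which reaches at least a (1 - 1/e) fraction of the optimal value under a cardinality constraint. The sketch below illustrates that rule on a toy coverage objective; the `coverage` function and the "scans as sets of map cells" data are illustrative assumptions, not OptMap's actual reward or representation.

```python
# Illustrative greedy maximization of a monotone submodular set function
# under a cardinality constraint. The coverage objective is a stand-in
# for a map-informativeness reward: each scan covers a set of map cells,
# and value is the number of distinct cells covered.

def coverage(selected, scans):
    """Submodular reward: number of distinct cells covered by the selection."""
    covered = set()
    for i in selected:
        covered |= scans[i]
    return len(covered)

def greedy_select(scans, budget):
    """Pick up to `budget` scans, each round adding the scan with the
    largest marginal gain (marginal gains shrink as the set grows)."""
    chosen = []
    for _ in range(budget):
        base = coverage(chosen, scans)
        gains = [(coverage(chosen + [i], scans) - base, i)
                 for i in range(len(scans)) if i not in chosen]
        best_gain, best_i = max(gains)
        if best_gain <= 0:
            break  # nothing left adds value
        chosen.append(best_i)
    return chosen

# Toy "scans", each covering a set of map cells (hypothetical data).
scans = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1, 7}]
picked = greedy_select(scans, budget=2)  # -> [2, 0], covering 7 cells
```

Each greedy step is O(n) function evaluations, so the whole selection is polynomial time, which is what makes the approach practical for real-time map distillation.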
The Marginal Value of Adaptive Gradient Methods in Machine Learning
Adaptive optimization methods, which perform local optimization with a metric constructed from the history of iterates, are becoming increasingly popular for training deep neural networks. Examples include AdaGrad, RMSProp, and Adam. We show that for simple overparameterized problems, adaptive methods often find drastically different solutions than gradient descent (GD) or stochastic gradient descent (SGD). We construct an illustrative binary classification problem where the data is linearly separable, GD and SGD achieve zero test error, and AdaGrad, Adam, and RMSProp attain test errors arbitrarily close to half. We additionally study the empirical generalization capability of adaptive methods on several state-of-the-art deep learning models. We observe that the solutions found by adaptive methods generalize worse (often significantly worse) than SGD, even when these solutions have better training performance. These results suggest that practitioners should reconsider the use of adaptive methods to train neural networks.
- Europe > Switzerland > Zürich > Zürich (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
E-LDA: Toward Interpretable LDA Topic Models with Strong Guarantees in Logarithmic Parallel Time
In this paper, we provide the first practical algorithms with provable guarantees for the problem of inferring the topics assigned to each document in an LDA topic model. This is the primary inference problem for many applications of topic models in social science, data exploration, and causal inference settings. We obtain this result by showing a novel non-gradient-based, combinatorial approach to estimating topic models. This yields algorithms that converge to near-optimal posterior probability in logarithmic parallel computation time (adaptivity) -- exponentially faster than any known LDA algorithm. We also show that our approach can provide interpretability guarantees such that each learned topic is formally associated with a known keyword. Finally, we show that unlike alternatives, our approach can maintain the independence assumptions necessary to use the learned topic model for downstream causal inference methods that allow researchers to study topics as treatments. In terms of practical performance, our approach consistently returns solutions of higher semantic quality than solutions from state-of-the-art LDA algorithms, neural topic models, and LLM-based topic models across a diverse range of text datasets and evaluation parameters.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
- Asia > Middle East > Jordan (0.04)
- North America > United States > New Hampshire > Grafton County > Hanover (0.04)
- North America > Canada (0.04)
- Government > Regional Government (0.67)
- Media > News (0.45)
Reviews: The Marginal Value of Adaptive Gradient Methods in Machine Learning
Adaptive methods are based on metrics which evolve along the optimization process. Contrary to what happens for gradient descent, Nesterov's method, or the heavy-ball method, this may result in estimates which lie outside the linear span of past visited points and estimated gradients. These methods have become very popular recently in a deep learning context. The main question addressed by the authors is how the two categories of methods compare. First, the authors construct an easy classification example for which they prove that adaptive methods behave very badly while non-adaptive methods achieve perfect accuracy. Second, the authors report extensive numerical comparisons of the different classes of algorithms, showing consistent superiority of non-adaptive methods.
Submodular Optimization for Keyframe Selection & Usage in SLAM
Thorne, David, Chan, Nathan, Ma, Yanlong, Robison, Christa S., Osteen, Philip R., Lopez, Brett T.
Keyframes are LiDAR scans saved for future reference in Simultaneous Localization And Mapping (SLAM), but despite their central importance, most algorithms leave the choices of which scans to save and how to use them to wasteful heuristics. This work proposes two novel keyframe selection strategies for localization and map summarization, as well as a novel approach to submap generation which selects keyframes that best constrain localization. Our results show that online keyframe selection and submap generation reduce the number of saved keyframes and improve per-scan computation time without compromising localization performance. We also present a map summarization feature for quickly capturing environments under strict map size constraints.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Maryland > Prince George's County > Adelphi (0.04)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
- Research Report > New Finding (0.54)
- Research Report > Promising Solution (0.34)
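Online keyframe selection means deciding whether to keep each scan as it arrives, without revisiting past decisions. A minimal way to sketch this is a single-threshold streaming rule in the spirit of sieve-streaming for submodular maximization: keep a scan only if its marginal gain clears a threshold. The fixed threshold, the set-coverage gain, and the input data here are all illustrative assumptions, not the paper's actual selection strategy.

```python
# Threshold-based streaming keyframe selection sketch. Each scan is a
# set of covered map cells (hypothetical representation); a scan is
# kept only if it adds at least `threshold` new cells, up to `budget`
# keyframes. One pass, O(1) decisions per scan.

def marginal_gain(covered, scan):
    """New cells this scan would contribute beyond what is covered."""
    return len(scan - covered)

def stream_keyframes(scan_stream, budget, threshold):
    kept, covered = [], set()
    for idx, scan in enumerate(scan_stream):
        if len(kept) >= budget:
            break  # keyframe budget exhausted
        if marginal_gain(covered, scan) >= threshold:
            kept.append(idx)
            covered |= scan
    return kept, covered

# Toy scan stream; scans 1, 3, and 4 are mostly redundant.
stream = [{1, 2}, {2}, {3, 4, 5}, {5, 6}, {7}]
kept, covered = stream_keyframes(stream, budget=3, threshold=2)
# -> keeps scans 0 and 2, covering cells {1, 2, 3, 4, 5}
```

In practice streaming algorithms run several thresholds in parallel (hence "sieve") because the right threshold depends on the unknown optimal value; the single-threshold version above only shows the keep/discard mechanic.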
Marginals-to-Models Reducibility
We consider a number of classical and new computational problems regarding marginal distributions, and inference in models specifying a full joint distribution. We prove general and efficient reductions between a number of these problems, which demonstrate that algorithmic progress in inference automatically yields progress for "pure data" problems. Our main technique involves formulating the problems as linear programs, and proving that the dual separation oracle required by the ellipsoid method is provided by the target problem. This technique may be of independent interest in probabilistic inference.
- North America > United States > Pennsylvania (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- (2 more...)
Fast Distributed Submodular Cover: Public-Private Data Summarization
In this paper, we introduce the public-private framework of data summarization, motivated by privacy concerns in personalized recommender systems and online social services. Such systems usually have access to massive data generated by a large pool of users. A major fraction of the data is public and is visible to (and can be used for) all users. However, each user can also contribute some private data that should not be shared with other users to ensure her privacy. The goal is to provide a succinct summary of the massive dataset, ideally as small as possible, from which customized summaries can be built for each user, i.e., a summary that can contain elements from the public data (for diversity) and the user's private data (for personalization). To formalize the above challenge, we assume that the scoring function according to which a user evaluates the utility of her summary satisfies submodularity, a widely used notion in data summarization applications. Thus, we model the data summarization targeted to each user as an instance of a submodular cover problem. However, when the data is massive, it is infeasible to use the centralized greedy algorithm to find a customized summary even for a single user. Moreover, for a large pool of users, it is too time-consuming to find such summaries separately.
- Europe > Switzerland > Zürich > Zürich (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
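The centralized greedy algorithm that the abstract calls infeasible at scale is, for submodular cover, the rule of repeatedly adding whichever element covers the most uncovered ground until the target is met; its solution size is within a logarithmic factor of optimal. A minimal sketch on a set-cover instance (the sets and universe below are made-up data):

```python
# Greedy submodular cover sketch: keep adding the set with the largest
# marginal coverage until the whole universe is covered. The number of
# sets chosen is within an O(log) factor of the smallest possible cover.

def greedy_cover(sets, universe):
    chosen, covered = [], set()
    while covered != universe:
        gain, best = max((len(s - covered), i) for i, s in enumerate(sets))
        if gain == 0:
            break  # remaining elements are unreachable
        chosen.append(best)
        covered |= sets[best]
    return chosen

sets = [{1, 2, 3}, {2, 4}, {4, 5}, {1, 5}]
universe = {1, 2, 3, 4, 5}
summary = greedy_cover(sets, universe)  # -> [0, 2]
```

The distributed algorithms the paper develops exist precisely because this loop re-evaluates every candidate against the full dataset on every iteration, which does not scale to massive data or many users.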
Handling Overlapping Asymmetric Datasets -- A Twice Penalized P-Spline Approach
McTeer, Matthew, Henderson, Robin, Anstee, Quentin M, Missier, Paolo
Overlapping asymmetric datasets are common in data science and pose the question of how they can be incorporated together into a predictive analysis. In healthcare datasets there is often a small amount of information available for a large number of patients, such as an electronic health record, while a small number of patients may have had extensive further testing. Common solutions such as missing data imputation can be unwise if the smaller cohort differs significantly in scale from the larger sample; the aim of this research is therefore to develop a new method which can model the smaller cohort against a particular response whilst also considering the larger cohort. Motivated by non-parametric models, and specifically flexible smoothing techniques via generalized additive models, we develop a twice penalized P-Spline approximation method, firstly to prevent over- or under-fitting of the smaller cohort and secondly to incorporate the larger cohort. This second penalty is created by examining discrepancies in the marginal value of covariates that exist in both the smaller and larger cohorts. Through a series of data simulations, penalty parameter tunings, and model adaptations covering both a continuous and a binary response, we find that our twice penalized approach offers an enhanced model fit over a linear B-Spline model and a once penalized P-Spline approximation method. Applying our twice penalized method to a real-life healthcare dataset relating to an individual's risk of developing Non-Alcoholic Steatohepatitis, we see an improvement in model fit of over 65% as opposed to the linear and once penalized methods. Areas for future work within this space include adapting our method to not require dimensionality reduction and also considering parametric modelling methods.
However, to our knowledge this is the first work to propose additional marginal penalties in a flexible regression, for which we report a vastly improved model fit that accommodates asymmetric datasets without the need for missing data imputation.
Fast Adaptive Non-Monotone Submodular Maximization Subject to a Knapsack Constraint
Amanatidis, Georgios, Fusco, Federico, Lazos, Philip, Leonardi, Stefano, Reiffenhäuser, Rebecca
Constrained submodular maximization is a fundamental problem at the heart of discrete optimization. The reason for this is as simple as it is clear: submodular functions capture the notion of diminishing returns present in a wide variety of real-world settings. Owing to its striking importance and its NP-hardness [20], extensive research has been conducted on submodular maximization since the seventies (e.g., [15, 42]), with focus lately shifting towards handling the massive datasets emerging in modern applications. With a wide variety of possible constraints, often regarding cardinality, independence in a matroid, or knapsack-type restrictions, the number of applications is vast. To name just a few, there are recent works on feature selection in machine learning [13, 14, 32], influence maximization in viral marketing [2, 31], and data summarization [43, 38, 45]. Many of these applications have non-monotone submodular objectives, meaning that adding an element to an existing set might actually decrease its value. Two such examples are discussed in detail in Section 5. This work was supported by the ERC Advanced Grant 788893 AMDROMA "Algorithmic and Mechanism Design Research in Online Markets" and the MIUR PRIN project ALGADIMAR "Algorithms, Games, and Digital Markets."
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > United States > New York > New York County > New York City (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- (25 more...)
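A knapsack constraint replaces the simple count budget with per-element costs. The standard starting point is density greedy: repeatedly add the feasible element with the best marginal-gain-to-cost ratio. On its own this rule carries no constant-factor guarantee (published algorithms combine it with enumeration or sampling, especially in the non-monotone case), so the sketch below, with its made-up coverage objective and item costs, only illustrates how the knapsack restriction shapes the selection.

```python
# Density-greedy sketch for submodular maximization under a knapsack
# constraint. items: list of (element_set, cost); value is the number
# of distinct elements covered; total cost must stay within `budget`.

def cover_value(selected, items):
    covered = set()
    for i in selected:
        covered |= items[i][0]
    return len(covered)

def density_greedy(items, budget):
    chosen, spent = [], 0.0
    while True:
        base = cover_value(chosen, items)
        candidates = [
            ((cover_value(chosen + [i], items) - base) / cost, i)
            for i, (s, cost) in enumerate(items)
            if i not in chosen and spent + cost <= budget]
        if not candidates:
            break  # nothing feasible remains
        ratio, best = max(candidates)
        if ratio <= 0:
            break  # no remaining item adds value
        chosen.append(best)
        spent += items[best][1]
    return chosen

# Toy items: (covered elements, cost); budget of 3.0 units.
items = [({1, 2, 3}, 2.0), ({3, 4}, 1.0), ({5}, 3.0)]
picked = density_greedy(items, budget=3.0)  # -> [1, 0]
```

Note how the cheap item wins the first round despite covering fewer elements: ratio, not raw gain, drives the selection, and the expensive third item is never feasible once the budget is partly spent.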